Richie's B-Twin
Development Release 1

Copyright (C) 1998 by Richard Theil. All rights reserved.
Provided to you free of charge. Don't redistribute for cash.
WARNING: Pre-Alpha grade code. Read the Disclaimer below.

Introduction

B-Twin is a full-text search engine, just like those we have already gotten used to on the Internet (AltaVista, HotBot, etc.) and will get used to in the forthcoming MacOS 8.5 "Allegro". I thought the BeOS should also have one, and here it is. A full-text index works by traversing a hierarchy of documents to create special look-up tables of the words contained in these documents. It can then very quickly find out which documents contain a given word, and it can answer queries containing multiple words by combining the resulting sets of documents.
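To make the idea concrete, here is a minimal sketch of such a look-up table in Python. It is not B-Twin's actual code: the in-memory dictionary stands in for the index files, and the names build_index, lookup and the sample documents are made up for illustration.

```python
from collections import defaultdict

def build_index(documents):
    """Map each word to the set of document names that contain it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in text.lower().split():
            index[word].add(name)
    return index

def lookup(index, words):
    """Return the documents that contain every given word (set intersection)."""
    sets = [index.get(word.lower(), set()) for word in words]
    return set.intersection(*sets) if sets else set()

# Hypothetical sample documents, keyed by file name.
docs = {
    "intro.html": "the BeOS file system",
    "kits.html": "the BeOS translation kit",
    "notes.txt": "file notes",
}
index = build_index(docs)
```

Building the table costs one pass over all documents, but afterwards a query only touches the sets for the words asked for, which is why look-ups are fast.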

Usage

Place the "B-Twin" application into the root folder of the hierarchy you want to index. The location of the documents to be indexed is hard-wired to the location of the application. A good start might be "/boot/beos/documentation", where all the BeOS-related documentation is concentrated. Launch the application and click on the "Make Index" button. B-Twin will then index all files inside the folder that it considers suitable. It displays the names of the files being indexed and reverts the status line to the about notice when it is finished. This may take quite a few minutes. When this is done, B-Twin is ready for queries. You may want to install a link to it or rename it to "!B-Twin", so you can find and access it more easily.

Enter any words you want to search for into the input line and click the "GO!" button. You will immediately get a list of files that contain these words. A file must contain all the words you entered in order to show up. For your convenience, you may then drag any of the files to the program you like. Unfortunately, the Tracker does not like to do drag&drop on applications here, but a NetPositive browser window is a good target for this (as it is likely anyway that you are querying a lot of HTML files).

Sum-Up and additional information

Entered words are AND-combined
Search is case insensitive
Words with one or two letters are ignored
Words that do not start with an alphabetic character are ignored
Files are indexed if their MIME type contains "text" or "html", or if they have no MIME type.
Index files are called "!Twin.Data" and "!Twin.Data2".
Expect index sizes up to the size of the original data (less for larger hierarchies).
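
The rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not B-Twin's code: the function names are made up, splitting words at non-alphanumeric characters is a guess, and so is returning no match when every query word is ignored.

```python
import re

def index_words(text):
    """Split text into lowercase words, dropping the words the rules ignore."""
    words = set()
    # Assumption: anything that is not a letter or digit separates words.
    for token in re.split(r"[^0-9A-Za-z]+", text):
        if len(token) < 3:
            continue  # words with one or two letters are ignored
        if not token[0].isalpha():
            continue  # words must start with an alphabetic character
        words.add(token.lower())  # search is case insensitive
    return words

def is_indexable(mime_type):
    """True if the MIME type contains "text" or "html", or is missing."""
    if not mime_type:
        return True
    return "text" in mime_type or "html" in mime_type

def matches(query, text):
    """AND-combine: every usable query word must occur in the text."""
    wanted = index_words(query)
    # Assumption: a query with no usable words matches nothing.
    return bool(wanted) and wanted <= index_words(text)
```

For example, matches("BeOS file", "The BeOS File System") is true, while a query for "beos kit" against the same text is not, because "kit" is missing.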

So what?

This is another attempt at seeing if there is reasonable demand for software within the BeOS community. This time not a castrated modeler for 3-D hacking nerds (which caused the overwhelming response of exactly one e-mail asking me how to code assembler in MW), but something that may be more widely useful. So, if you want to see things improving, drop me an e-mail at 'Richard.Theil@frankfurt.netsurf.de'. Tell me how you like it or dislike it, what features you would like to see added, and what you would pay for a commercial-grade, full-featured copy (or many of them, if you are a reseller).

Features that I thought of

Radical speedup of index operations.
Allow use of boolean operations and wildcards (like AltaVista).
Allow GUI-less use from command line for use with CGI scripts on web sites.
Allow configuration of index locations and rules.
Watch file system nodes for differential updates in realtime.
Use the Translation Kit for better text extraction.
Wintel port. (Oops. That was a Freudian typo. I meant Intel, of course ;)

I may do the last one for fun if I finally get x86-BeOS to recognize the CD on my Lintel box, but other than that I won't get up again without significant cash flow. No sex, no crime, no violence with that stuff, you know, boring to code... Doing flashy 3-D stuff is simply more joy these days, especially if I don't get paid (or laid) anyway.

[nota bene: Getting laid in the BeOS community turns out to be difficult, according to a recent survey I saw (97% male users), unless one considers Greek style as a viable alternative. Compare this to the figures on another current computing offer... - - - iThink, iLove, i... YESSS, Fancyness DOES matter.]


Disclaimer

Use at your own risk or don't use. This is pre-alpha software. It does about what it should in my limited environment (PowerMac 8500, BeOS R3), but there are quite a few hacks and almost no error checking included. If you intend to use this software in countries that grant patent rights on ideas or algorithms, you are solely responsible for making sure you don't infringe any third-party rights. This should not be a problem within the civilized parts of the European Union.